Overview

Dataset statistics

Number of variables: 8
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 31.4 KiB
Average record size in memory: 64.3 B

Variable types

Numeric: 7
Categorical: 1

Alerts

GRE Score is highly correlated with TOEFL Score and 6 other fields (High correlation)
TOEFL Score is highly correlated with GRE Score and 5 other fields (High correlation)
University Rating is highly correlated with GRE Score and 5 other fields (High correlation)
SOP is highly correlated with GRE Score and 5 other fields (High correlation)
LOR is highly correlated with GRE Score and 5 other fields (High correlation)
CGPA is highly correlated with GRE Score and 6 other fields (High correlation)
Research is highly correlated with GRE Score and 2 other fields (High correlation)
Chance of Admit is highly correlated with GRE Score and 6 other fields (High correlation)
GRE Score is highly correlated with TOEFL Score and 2 other fields (High correlation)
TOEFL Score is highly correlated with GRE Score and 4 other fields (High correlation)
University Rating is highly correlated with TOEFL Score and 3 other fields (High correlation)
SOP is highly correlated with TOEFL Score and 4 other fields (High correlation)
LOR is highly correlated with SOP (High correlation)
CGPA is highly correlated with GRE Score and 4 other fields (High correlation)
Chance of Admit is highly correlated with GRE Score and 4 other fields (High correlation)
GRE Score is highly correlated with TOEFL Score and 4 other fields (High correlation)
TOEFL Score is highly correlated with GRE Score and 6 other fields (High correlation)
University Rating is highly correlated with GRE Score and 5 other fields (High correlation)
SOP is highly correlated with TOEFL Score and 4 other fields (High correlation)
LOR is highly correlated with TOEFL Score and 3 other fields (High correlation)
CGPA is highly correlated with GRE Score and 5 other fields (High correlation)
Research is highly correlated with GRE Score and 3 other fields (High correlation)
Chance of Admit is highly correlated with GRE Score and 6 other fields (High correlation)

Reproduction

Analysis started: 2022-08-10 11:32:49.842890
Analysis finished: 2022-08-10 11:32:58.726541
Duration: 8.88 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

GRE Score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 50
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 316.5587629
Minimum: 290
Maximum: 340
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 290
5-th percentile: 298
Q1: 309
Median: 316.5587629
Q3: 324
95-th percentile: 335
Maximum: 340
Range: 50
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 11.10395182
Coefficient of variation (CV): 0.03507706348
Kurtosis: -0.6124575385
Mean: 316.5587629
Median Absolute Deviation (MAD): 7.558762887
Skewness: -0.05247488122
Sum: 158279.3814
Variance: 123.297746
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
312                 22     4.4%
324                 22     4.4%
322                 17     3.4%
321                 17     3.4%
316                 17     3.4%
327                 17     3.4%
314                 16     3.2%
320                 16     3.2%
311                 16     3.2%
316.5587629         15     3.0%
Other values (40)   325    65.0%

Smallest values
Value               Count  Frequency (%)
290                 2      0.4%
293                 1      0.2%
294                 2      0.4%
295                 5      1.0%
296                 5      1.0%
297                 6      1.2%
298                 10     2.0%
299                 8      1.6%
300                 12     2.4%
301                 10     2.0%

Largest values
Value               Count  Frequency (%)
340                 9      1.8%
339                 3      0.6%
338                 4      0.8%
337                 2      0.4%
336                 5      1.0%
335                 4      0.8%
334                 7      1.4%
333                 4      0.8%
332                 7      1.4%
331                 9      1.8%

TOEFL Score
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 107.1877551
Minimum: 92
Maximum: 120
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 92
5-th percentile: 98
Q1: 103
Median: 107
Q3: 112
95-th percentile: 118
Maximum: 120
Range: 28
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 6.051337905
Coefficient of variation (CV): 0.05645549624
Kurtosis: -0.616646021
Mean: 107.1877551
Median Absolute Deviation (MAD): 4
Skewness: 0.1030976372
Sum: 53593.87755
Variance: 36.61869044
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Common values
Value               Count  Frequency (%)
110                 42     8.4%
105                 37     7.4%
104                 29     5.8%
107                 28     5.6%
112                 27     5.4%
106                 26     5.2%
103                 25     5.0%
100                 24     4.8%
102                 24     4.8%
99                  22     4.4%
Other values (20)   216    43.2%

Smallest values
Value               Count  Frequency (%)
92                  1      0.2%
93                  2      0.4%
94                  2      0.4%
95                  3      0.6%
96                  6      1.2%
97                  7      1.4%
98                  10     2.0%
99                  22     4.4%
100                 24     4.8%
101                 19     3.8%

Largest values
Value               Count  Frequency (%)
120                 9      1.8%
119                 10     2.0%
118                 10     2.0%
117                 8      1.6%
116                 16     3.2%
115                 11     2.2%
114                 18     3.6%
113                 18     3.6%
112                 27     5.4%
111                 20     4.0%

University Rating
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.121649485
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.128801908
Coefficient of variation (CV): 0.3616043101
Kurtosis: -0.7666145441
Mean: 3.121649485
Median Absolute Deviation (MAD): 1
Skewness: 0.09244548876
Sum: 1560.824742
Variance: 1.274193748
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value               Count  Frequency (%)
3                   154    30.8%
2                   124    24.8%
4                   103    20.6%
5                   72     14.4%
1                   32     6.4%
3.121649485         15     3.0%

Smallest values
Value               Count  Frequency (%)
1                   32     6.4%
2                   124    24.8%
3                   154    30.8%
3.121649485         15     3.0%
4                   103    20.6%
5                   72     14.4%

Largest values
Value               Count  Frequency (%)
5                   72     14.4%
4                   103    20.6%
3.121649485         15     3.0%
3                   154    30.8%
2                   124    24.8%
1                   32     6.4%

SOP
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.374
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.5
Q1: 2.5
Median: 3.5
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 0.9910036208
Coefficient of variation (CV): 0.2937177299
Kurtosis: -0.7057169536
Mean: 3.374
Median Absolute Deviation (MAD): 0.5
Skewness: -0.2289723963
Sum: 1687
Variance: 0.9820881764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values
Value               Count  Frequency (%)
4                   89     17.8%
3.5                 88     17.6%
3                   80     16.0%
2.5                 64     12.8%
4.5                 63     12.6%
2                   43     8.6%
5                   42     8.4%
1.5                 25     5.0%
1                   6      1.2%

Smallest values
Value               Count  Frequency (%)
1                   6      1.2%
1.5                 25     5.0%
2                   43     8.6%
2.5                 64     12.8%
3                   80     16.0%
3.5                 88     17.6%
4                   89     17.8%
4.5                 63     12.6%
5                   42     8.4%

Largest values
Value               Count  Frequency (%)
5                   42     8.4%
4.5                 63     12.6%
4                   89     17.8%
3.5                 88     17.6%
3                   80     16.0%
2.5                 64     12.8%
2                   43     8.6%
1.5                 25     5.0%
1                   6      1.2%

LOR
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.484
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 3.5
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9254495739
Coefficient of variation (CV): 0.2656284655
Kurtosis: -0.7457485106
Mean: 3.484
Median Absolute Deviation (MAD): 0.5
Skewness: -0.1452903146
Sum: 1742
Variance: 0.8564569138
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Common values
Value               Count  Frequency (%)
3                   99     19.8%
4                   94     18.8%
3.5                 86     17.2%
4.5                 63     12.6%
2.5                 50     10.0%
5                   50     10.0%
2                   46     9.2%
1.5                 11     2.2%
1                   1      0.2%

Smallest values
Value               Count  Frequency (%)
1                   1      0.2%
1.5                 11     2.2%
2                   46     9.2%
2.5                 50     10.0%
3                   99     19.8%
3.5                 86     17.2%
4                   94     18.8%
4.5                 63     12.6%
5                   50     10.0%

Largest values
Value               Count  Frequency (%)
5                   50     10.0%
4.5                 63     12.6%
4                   94     18.8%
3.5                 86     17.2%
3                   99     19.8%
2.5                 50     10.0%
2                   46     9.2%
1.5                 11     2.2%
1                   1      0.2%

CGPA
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 184
Distinct (%): 36.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.57644
Minimum: 6.8
Maximum: 9.92
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 6.8
5-th percentile: 7.638
Q1: 8.1275
Median: 8.56
Q3: 9.04
95-th percentile: 9.6
Maximum: 9.92
Range: 3.12
Interquartile range (IQR): 0.9125

Descriptive statistics

Standard deviation: 0.6048128003
Coefficient of variation (CV): 0.07052026253
Kurtosis: -0.5612783981
Mean: 8.57644
Median Absolute Deviation (MAD): 0.46
Skewness: -0.02661251732
Sum: 4288.22
Variance: 0.3657985234
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
8.76                9      1.8%
8                   9      1.8%
8.12                7      1.4%
8.45                7      1.4%
8.54                7      1.4%
8.56                7      1.4%
8.65                6      1.2%
7.88                6      1.2%
9.11                6      1.2%
9.04                6      1.2%
Other values (174)  430    86.0%

Smallest values
Value               Count  Frequency (%)
6.8                 1      0.2%
7.2                 1      0.2%
7.21                1      0.2%
7.23                1      0.2%
7.25                1      0.2%
7.28                1      0.2%
7.3                 1      0.2%
7.34                2      0.4%
7.36                1      0.2%
7.4                 1      0.2%

Largest values
Value               Count  Frequency (%)
9.92                1      0.2%
9.91                1      0.2%
9.87                2      0.4%
9.86                1      0.2%
9.82                1      0.2%
9.8                 3      0.6%
9.78                1      0.2%
9.76                2      0.4%
9.74                1      0.2%
9.7                 2      0.4%

Research
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value               Count
1                   280
0                   220

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 500
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Most occurring characters

Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      500    100.0%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              500    100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               500    100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
1                   280    56.0%
0                   220    44.0%

Chance of Admit
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 61
Distinct (%): 12.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.72174
Minimum: 0.34
Maximum: 0.97
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 0.34
5-th percentile: 0.47
Q1: 0.63
Median: 0.72
Q3: 0.82
95-th percentile: 0.94
Maximum: 0.97
Range: 0.63
Interquartile range (IQR): 0.19

Descriptive statistics

Standard deviation: 0.141140404
Coefficient of variation (CV): 0.1955557458
Kurtosis: -0.4546817998
Mean: 0.72174
Median Absolute Deviation (MAD): 0.1
Skewness: -0.28996621
Sum: 360.87
Variance: 0.01992061363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0.71                23     4.6%
0.64                19     3.8%
0.73                18     3.6%
0.72                16     3.2%
0.79                16     3.2%
0.78                15     3.0%
0.76                14     2.8%
0.62                13     2.6%
0.94                13     2.6%
0.7                 13     2.6%
Other values (51)   340    68.0%

Smallest values
Value               Count  Frequency (%)
0.34                2      0.4%
0.36                2      0.4%
0.37                1      0.2%
0.38                2      0.4%
0.39                1      0.2%
0.42                4      0.8%
0.43                1      0.2%
0.44                3      0.6%
0.45                3      0.6%
0.46                5      1.0%

Largest values
Value               Count  Frequency (%)
0.97                4      0.8%
0.96                8      1.6%
0.95                5      1.0%
0.94                13     2.6%
0.93                12     2.4%
0.92                9      1.8%
0.91                10     2.0%
0.9                 9      1.8%
0.89                11     2.2%
0.88                4      0.8%

Interactions

[Figure: pairwise interaction plots for the 7 numeric variables (49 panels)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
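
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs: the helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are assumptions made for this example, and ties are handled by giving tied values their average rank.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance over the product of standard deviations (the 1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson_r(x, y))
```

On this toy data the rank-based coefficient reaches its maximum while Pearson's r stays below 1, which is the sense in which ρ captures nonlinear monotonic relationships.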

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
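
As a rough illustration of this formula and of the invariance property, the sketch below computes r by hand. The function name `pearson_r` is an assumption for this example; the values are five of the first rows shown in the Sample section (GRE Score and Chance of Admit), and the rescaling constants are arbitrary.

```python
def pearson_r(x, y):
    """Covariance over the product of standard deviations (the 1/n factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Rows 0, 1, 3, 4 and 5 of the sample table.
gre = [337, 324, 322, 314, 330]
admit = [0.92, 0.76, 0.80, 0.65, 0.90]

r_original = pearson_r(gre, admit)

# Shift and rescale each variable separately (arbitrary positive scale factors).
gre_rescaled = [(g - 290) / 50 for g in gre]
admit_percent = [100 * a for a in admit]
r_rescaled = pearson_r(gre_rescaled, admit_percent)

# The two coefficients agree up to floating-point rounding.
print(r_original, r_rescaled)
```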

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
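
A small sketch of this pair-counting definition follows. It implements the formula exactly as stated (often called tau-a), which may differ from what the report's software computes: libraries commonly use a tie-adjusted variant (tau-b). The function name `kendall_tau_a` is an assumption for this example, and the data are five of the first rows shown in the Sample section.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Rows 0, 1, 3, 4 and 5 of the sample table (GRE Score, Chance of Admit).
gre = [337, 324, 322, 314, 330]
admit = [0.92, 0.76, 0.80, 0.65, 0.90]
tau = kendall_tau_a(gre, admit)
print(tau)
```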

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     GRE Score   TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
0    337.000000  118.0        4.000000           4.5  4.5  9.65  1         0.92
1    324.000000  107.0        4.000000           4.0  4.5  8.87  1         0.76
2    316.558763  104.0        3.000000           3.0  3.5  8.00  1         0.72
3    322.000000  110.0        3.000000           3.5  2.5  8.67  1         0.80
4    314.000000  103.0        2.000000           2.0  3.0  8.21  0         0.65
5    330.000000  115.0        5.000000           4.5  3.0  9.34  1         0.90
6    321.000000  109.0        3.121649           3.0  4.0  8.20  1         0.75
7    308.000000  101.0        2.000000           3.0  4.0  7.90  0         0.68
8    302.000000  102.0        1.000000           2.0  1.5  8.00  0         0.50
9    323.000000  108.0        3.000000           3.5  3.0  8.60  0         0.45

Last rows

     GRE Score   TOEFL Score  University Rating  SOP  LOR  CGPA  Research  Chance of Admit
490  307.0       105.0        2.0                2.5  4.5  8.12  1         0.67
491  297.0       99.0         4.0                3.0  3.5  7.81  0         0.54
492  298.0       101.0        4.0                2.5  4.5  7.69  1         0.53
493  300.0       95.0         2.0                3.0  1.5  8.22  1         0.62
494  301.0       99.0         3.0                2.5  2.0  8.45  1         0.68
495  332.0       108.0        5.0                4.5  4.0  9.02  1         0.87
496  337.0       117.0        5.0                5.0  5.0  9.87  1         0.96
497  330.0       120.0        5.0                4.5  5.0  9.56  1         0.93
498  312.0       103.0        4.0                4.0  5.0  8.43  0         0.73
499  327.0       113.0        4.0                4.5  4.5  9.04  0         0.84